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Abstract 

The efficiency and accuracy of several time inte- 
gration schemes are investigated for the unsteady 
Navier-Stokes equations. This study focuses on the 
efficiency of higher-order Runge-Kutta schemes in 
comparison with the popular Backward Differenc- 
ing Formulations. For this comparison an unsteady 
two-dimensional laminar flow problem is chosen, i.e. 
flow around a circular cylinder at Re=1200. It is 
concluded that for realistic error tolerances (smaller 
than 1CT 1 ) fourth- and fifth-order Runge Kutta 
schemes are the most efficient. For reasons of robust- 
ness and computer storage, the fourth-order Runge- 
Kutta method is recommended. The efficiency of the 
fourth-order Runge-Kutta scheme exceeds that of 
second-order Backward Difference Formula (BDF2) 
by a factor of 2.5 at engineering error tolerance lev- 
els (10 _1 -10 -2 ). Efficiency gains are more dramatic 
at smaller tolerances. 

Introduction 

Due to constraints of computing costs, the devel- 
opment of numerical techniques for fluid flow simula- 
tions in the past has focused mainly on steady state 
calculations. However, many physical phenomena 
of interest are inherently unsteady; a few examples 
being separated flows, wake flows and buffet, fluid 
actuators and maneuvering. With the continuous re- 
duction of computer costs recently more attention is 
devoted to the simulation of these unsteady flows 1 , 4 . 
However, the need for further reduction of computer 
time for unsteady flow computations is still appar- 
ent. Therefore, in this paper we investigate possible 
reductions in computer time due to the choice of 
an efficient time integration scheme from a series of 
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schemes ranging from first to fifth order. A similar 
study was performed for only first and second order 
time integration methods by Marx 11 . 

Implicit in any comparison of efficiency is a precise 
error tolerance requirement. Unfortunately, seldom 
does a single scheme prove to be optimal over a wide 
range of solution error tolerances. It is well known 
that high-order schemes (fourth-, fifth-, etc.) out- 
perform low order schemes for error tolerances that 
are small (f« 1CV 7 ). For example, Kennedy 8 com- 
pares explicit third-, fourth- and fifth-order Runge- 
Kutta (RK) schemes on Direct Navier-Stokes simu- 
lations (DNS), and determines the optimal order for 
a given temporal error tolerance. For DNS it was 
found, the fourth-order methods are optimal over a 
surprisingly broad range of error tolerances, and are 
competitive at large error tolerances as well. 

The hallmark of large-scale aerodynamics calcula- 
tions is that they seldom require small error toler- 
ances. Calculations that are accurate to one or two 
significant digits, which translates into an error tol- 
erance of (10 _1 -10~ 2 ), are frequently sufficient. For 
these aerodynamics calculations, the second-order 
accurate BDF2 scheme is currently the method of 
choice. There is little question that calculations re- 
quiring low error tolerances (10“ 4 -10 -5 ) will be well 
suited for fourth-order RK formulations. The cen- 
tral question of this study, however, is the feasibility 
of using fourth-order RK formulations for simula- 
tions requiring error tolerances of (10 _1 -10 -2 ). 

A production aerodynamics solver is needed for 
this study. For this, the extensively tested and well 
documented solver of Vatsa (TLNS3D) 18 is chosen. 
This multi-block structured grid solver is representa- 
tive of a broad class of commonly used solvers. The 
TLNS3D (Thin-Layer Navier-Stokes 3-Dimensions) 
code utilizes a special form of the unsteady thin- 
layer Navier-Stokes equations. The spatial terms 
are discretized using a conventional cell-centered fi- 
nite volume scheme with artificial dissipation added 
for stability. Time is discretized in a fully implicit 
sense using both multistep BDF and multistage RK 
schemes. The resultant nonlinear algebraic equa- 
tions are solved iteratively in pseudo-time with a 
multi-grid acceleration used to speed up the conver- 
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gence to (pseudo-time) steady-state. 

In this paper, first, the governing flow equations 
and the space discretization are given. Thereafter, 
the time discretization techniques employed, i.e. 
several multi step Backward Differencing Formula- 
tions and multistage Runge-Kutta schemes, are ex- 
tensively discussed. This is followed by a description 
of the solution algorithm. That is, the implicit time 
integration of the flow equations and the iterative al- 
gorithm for solving the resulting non-linear implicit 
equation. Through a pseudo-time stability analysis 
the exact nature of the pseudo-time sub-iterations is 
analyzed. After validation of the space-time method 
for unsteady laminar flow around a circular cylin- 
der, accuracy and efficiency of the time integration 
schemes is discussed. 


Governing Equations 


In the present work, a modified version of the thin- 
layer Navier-Stokes equations is used to model the 
flow. The equation set is obtained from the com- 
plete Navier-Stokes equations by retaining only the 
viscous diffusion terms normal to the solid surfaces. 
For a body-fitted coordinate system (£,,r),Q fixed in 
time, these equations can be written in the conser- 
vative form as: 


d(U) fl(F-F v ) 

at di 

9(G- G v ) fl(H-H v ) 

drj d( 


where U represents a combination of the transforma- 
tion Jacobian J and the conserved variable vector. 
The vectors F, G, H, and F v , G v , H v represent the 
convective and diffusive fluxes in the three trans- 
formed coordinate directions, respectively. These 
equations represent a generalized form of the classi- 
cal thin-Layer Navier-Stokes equations and include 
all normal components of the viscous stress terms. 
The TLNS3D computer code, is used in this study 
to solve equation (1). Many references exist detail- 
ing the discretization and implementational issues of 
TLNS3D. We include only a brief summary of the 
general features, and refer to the work of Vat.sa 18 
for further details. 


Space Discretization 


The spatial terms in Equation (1) are discretized 
using a standard cell-centered finite volume scheme. 
The convection terms are discretized with second- 
order central differences with scalar/matrix artifi- 
cial dissipation (second- and fourth- difference dissi- 
pation) added to suppress the odd-even decoupling 


and oscillations in the vicinity of shock waves and 
stagnation points 18 . The viscous terms are cen- 
tral differenced with second-order formulas. The 
turbulence models used are Baldwin-Lomax 2 , and 
Spalart-Allmaras 15 . 

Time Discretization 

Consider the integration of the system of ordi- 
nary differential equations (ODE’s) represented by 
the equation 


dU 

dt 


S(*,U(*)) . 


The vector S in our case, results from the semi- 
discretization of the equations of fluid mechanics 
plus a suitable turbulence model. Inclusion of the 
turbulence model enables the simulation of high 
Reynolds number flows in excess of 10 7 , but indi- 
rectly introduces extremely fine length-scales in the 
near- wall regions. Near wall stiffness in the range of 
10 3 — 10 4 is not uncommon in practical engineering 
problems, and increases with Reynolds number. It 
is imperative that any efficient solver of these equa- 
tions be able to maintain stability at arbitrarily large 
time steps, thus allowing the potential to “step over” 
unimportant boundary layer time scales. 

The term “stiffness” is difficult to define. A prac- 
tical definition for the purpose of this work, com- 
pares an implicit timestep At 1 with a baseline ex- 
plicit timestep A t B . The implicit timestep is the 
maximum allowed by accuracy considerations, and 
the baseline explicit timestep is obtained from sta- 
bility consideration. All other variables are identi- 
cal in the comparison. The stiffness is the ratio of 
the two timesteps. The implicit scheme is capable of 
stepping over large negative eigenvalues that are not 
important for solution accuracy, while the explicit 
scheme must bound them in the finite stability en- 
velop. 

There are two mathematical properties that all 
candidate numerical integrators should possess. The 
first, (and most important) is the “A-stability” prop- 
erty which guarantees that all eigenvalues lying in 
the left half of the complex plane (LHP) will have 
an amplification of no more than 1, independent of 
the chosen step size. The only restriction on the 
time-step with an A-stable scheme is the consider- 
ation of solution accuracy. The second is the “L- 
stability” property which guarantees that eigenval- 
ues approaching — oo are damped in one time-step. 
These spurious eigenvalues are generated by high fre- 
quency information in the spatial discretization, and 
by incomplete solution of the non-linear system at 
each time-step. (The nonlinear system is never con- 


2 



verged to the levels of machine precision). If these 
spurious modes are not strongly damped they can 
build up and cause instability. 

Most numerical methods suitable for these compu- 
tations can be categorized into two broad classes: 1) 
multistep, and 2) multistage, each having its advan- 
tages and disadvantages. The current “methods of 
choice” in the computation of large scale engineering 
flows are the multistep BDF formulas, and in par- 
ticular the BDF2 scheme. These schemes achieve 
great efficiency because they solve only one non- 
linear set of equations per time-step. They suf- 
fer, however, from not being self-starting, are dif- 
ficult to use with variable time steps, and are not 
A-stable beyond second-order temporal accuracy. 
Multi-stage Runge-Kutta schemes require multiple 
nonlinear solves per time-step, but are self start- 
ing, are easily implemented in variable time-stepping 
mode, and can be designed with A- and L-stability 
properties for any temporal order. 

Practical experience indicates that large scale en- 
gineering computations are seldom stable if run with 
BDF4 12 . The BDF3 scheme is often stable, but di- 
verges for certain problems and some spatial opera- 
tors. These failures result from portions of the Left 
Half Plane that are not stable for arbitrarily large 
timesteps. Thus, a reasonable practitioner uses the 
BDF2 scheme exclusively for large scale computa- 
tions. The essential question this paper seeks to 
address is whether high-order RK schemes can be 
designed which are more efficient than the BDF2 
schemes, and if so, what is the optimal order of the 
RK scheme. 

The general formula for a fc-step BDF scheme can 
be written as 


&-i 

U(”+ fc ) = + {St)f3 k S in+k '>{2) 

i = 0 


The BDF formulas involve at each time-step the 
storage of k + 1 levels of the solution vector U, and 
the implicit solution of one set of nonlinear equa- 
tions. The BDF schemes are subject to the famous 
Dahlquist. barrier, which proves that A-st.ability is 
impossible for linear multistep schemes beyond sec- 
ond order. Table (1) presents the stability prop- 
erties and diagonal contribution 3k of several BDF 
schemes. 


Method 

Steps 

Order 

L(a) 

3k 

BDF1 

1 

1 

L(90°) 

1 

BDF2 

2 

2 

L{ 90°) 

2 

3 

BDF3 

3 

3 

£(86.03°) 

6 

11 

BDF4 

4 

4 

L{ 73.35°) 

12 

25 

BDF5 

5 

5 

£(51.84°) 

60 

137 

BDF6 

6 

6 

£(17.84°) 

60 

147 


Table 1. Properties of the BDF methods. 


The BDF1 and BDF2 schemes are L-stable for eigen- 
values anywhere in the left half of the complex plane. 
Beyond second order the BDF schemes are L (te- 
stable, with the a being defined in Table (1). This 
definition of stability implies that the schemes will 
be stable for eigenvalues lying in the wedge bounded 
above and below by the complex lines ±a. For fur- 
ther details see the work of Hairer 6 . 

The 

Runge-Kutta methods are multistage schemes and 
are implemented as 

k 

U* = U” + (At)J2 a ijS(U j ) , k = 1,« 

3 = 1 
s 

U n+1 _ u n + {At)J2 b jS (U j ) (3) 

3 = 1 
s 

U n+1 = U n + (AtfJ^bjS (IH), 
j=i 

where s is the number of stages and ctjj and bj are 
the Butcher coefficients of the scheme. The vectors 
U, and U are the p tft -order and (p — l) tft -order so- 
lutions at time level n + 1. The vector U is used 
solely for estimating error and is obtained with little 
additional computational overhead. In this work we 
choose to focus on the ESDIRK class of RK schemes, 
which refers to the .Explicit first stage, Single diag- 
onal coefficient, Diagonally Implicit Runge-A'utta. 
The Butcher tableau for these schemes (here repre- 
sented with s = 5) takes the form 


0 

0 

0 

0 

0 

0 

C 2 

a 2 i 

&hk 

0 

0 

0 

C 3 

a 3 i 

«32 

®kk 

0 

0 

C 4 

a 4 i 

<^42 

<343 

^kk 

0 

1 

b\ 

^2 

^3 

64 

<3 kk 


b\ 

^2 

^3 

64 

<3 kk 


bi 

^2 

^3 

64 

^5 


where c* denotes the point in the time interval 
t — > t+ At where the intermediate stage is evaluated, 
and bj are the coefficients used for the embedded er- 
ror predictor. The fully implicit RK schemes are not 
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pursued owing to the complexity of their implemen- 
tation in our general aerodynamics solver TLNS3D. 
The ESDIRK schemes are used rather than the stan- 
dard SDIRK schemes {an = 0, see Hairer 7 ) be- 
cause the ESDIRK schemes can achieve stage or- 
der 2 (each stage is at least second-order accurate), 
and the extra column of nonzero Butcher coefficients 
(an) allows more flexibility when designing new 
schemes. For all schemes, the “stiffly-accurate” as- 
sumption ( a s j = bj ) is enforced, which automatically 
extends A-stability into L-stability. In addition, it 
eliminates the potentially damaging explicit update 
U n+1 — U n + At X]j=i bjS^ from the algorithm, and 
replaces it with the condition U n+1 = U s . 

Numerous ESDIRK schemes were developed and 
tested in this study. Three schemes, one each of 
third-, fourth-, and fifth-order accuracy were chosen 
as being representative of each of these orders. No 
claim is made as to their optimality. Henceforth, 
they are referred to as ESDIRK3, ESDIRK4, and 
ESDIRK5, respectively. All three schemes are pre- 
sented in Kennedy 9 . The coefficients for the recom- 
mended ESDIRK4 scheme are included in the Ap- 
pendix. 


Solution Algorithm 

Discretizing equation (1) with an s-stage ESDIRK 
scheme represented in equation (4) yields for stage 
k, the expression 

^^ + E^Si = 0, *=!,. (4) 

3 = 1 

with 

o( «(— F + F v ) , 9(-G + G v ) 

s = — aj — + a, 

Cf(-H+H v ) 

d( 

The vector U fc is the solution at stage k, U n is the 
solution at the previous time level n, and a k j are 
the Butcher parameters for the RK-method used. 
Again note that in our case the last stage gives the 
solution at the new time level, that is U»+! = u s . 
Advancing the solution vector U fc from time level 
n — » ri + 1, requires solving the sequential set of 
s nonlinear algebraic equations defined in equation 

(4). 

Pseudo-Time Iterative Algorithm 

We follow here the work of Melson et, al. 12 
which was originally developed for a BDF algorithm. 
Equation (4) can be difficult to solve in its present 


form. Thus, the pseudo-time term is added to 
each stage k, yielding the expression 


dU fc U k - U" 
+ 


8t 


At 


+ J2 a kj$ 3 = 0, k = l,s. (5) 
j = i 


In this form, equation (5) is amenable to all the non- 
linear solving machinery available in TLNS3D. Each 
nonlinear equation is marched in pseudo-time with 
multi-grid acceleration until a predetermined con- 
vergence criterion is satisfied. 

Equation (5) can be rewritten in the form 

<9U fc U fc 

~ x — + ~rr + a kkS k + E = 0, k = l,s (6) 
or At 

with 


E = 


-U n 

At 


k - 1 

+ J2 a kjS J , 

3 = 1 


where E includes all the iteration independent in- 
formation at stage k of the RK scheme. Discretiz- 
ing equation (6) in the variable r using an Implicit- 
Explicit Runge-Kutta ( RK T ) scheme, yields 


U 


k p 


u k ° 


k p 


At 


+ G® 


,U' 

~At 


+ a kk S fc<P !) + E) = 0 


(7) 


where the p is the stage value of the RK r scheme. 
Note that the contribution from the time term is 
treated implicitly, the inviscid and viscous flux terms 
are treated explicitly, and that first-order temporal 
accuracy is sufficient for this scheme. Adding, sub- 
tracting, rearranging terms and accounting for resid- 
ual smoothing in equation (7) yields 


(1 + 


a P^ T 'jjfcp 

At ' 


U fcP 

-«»AU, r,[— 


jjfc 0 i a P^ T - [ jk p - 1 

At 

a kk S kPl + E], (8) 


where L~ ^ is the implicit residual smoothing matrix. 
Note that the bracketed term in equation (8) is still 
the residual of the physical time RK scheme. 

Pseudo-Time Stability Analysis 

Each grid level of the multi-grid process yields an 
equation similar to equation (8), with minor differ- 
ences in the variable E accounting for grid trans- 
fer contributions. The pseudo-time sub-iteration 
should converge rapidly, and any remaining residual 
at the end of the iteration should be devoid of high- 
frequency content. A Fourier stability analysis is 
used to analyze the exact nature of the pseudo-time 
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sub-iteration and to optimize the residual smooth- 
ing parameters. 

A necessary first test for the sub-iteration al- 
gorithm is good behavior for the 1-D wave equa- 
tion Ut + aU x = 0. Convergence and stability are 
strong functions of the discrete spatial operator U x . 
TLNS3D uses spatial discretizations that closely re- 
semble an approximation of the form 

dU_ _ U i+ 1 - Uj- 1 
dx 9 . /\ . 7 * 

U i+2 - + 6 Uj - 41 /,-! + 

+ 32A* 

The analysis begins by substituting this spatial op- 
erator into the 1-D wave equation on a periodic do- 
main, discretizing time using the ESDIRK scheme, 
and assuming solutions of the form U(x,t) = 
u(t)e l Implementing the pseudo-time sub- 

iteration algorithm described in equation (8) then 
yields for the p th stage of the RK T operator, the 
expression 

(1 + a p R)g p = g° + a p Rg p ~ 1 
- a p L~ r \[Rg p ~ l + a kk \ ^g p ~ 1 ], ( 9 ) 

with 

<j> = i sin($) + — sin(#/2) 4 

L~ r l = (1 + 4 7 sin (e/2ff\ 


and 

9 P 


n i ^ — a,i ^ — a kk 

U u At 


aAr 

Ax 


The variable E is eliminated from consideration in 
the stability analysis because it is constant during 
the pseudo-time sub-iteration and therefore has no 
influence on the stability. The diagonal coefficient 
a kk is the same on each stage of an ESDIRK, which 
yields an identical stability analysis for each stage. 
(A stability analysis for the BDF scheme given by 
equation (2) yields a similar result, with the variable 
a kk replaced by /3 k .) 

Several observations about equation (9) are in- 
structive. The independent parameters are 7: the 
level of implicit residual smoothing, R: the ratio of 
time-steps, and A: the CFL condition for the pseudo- 
time-stepping scheme multiplied by the diagonal co- 
efficient from the ESDIRK scheme. In general the 
parameter R satisfies the condition 0 < R < 1 . The 
limit R — > 0 or At — y 00 reduces to the steady 
state formulation. The limit R — > 1 corresponds 
to a physical time-step for which an explicit method 
would be stable, a situation that is unlikely to occur 


in high Reynolds number calculations. 

An exhaustive study in the parameters 7, R , A 
yields the following general conclusions. The resid- 
ual smoothing parameter 7 is most productive when 
the effective stability limit of the sub-iteration is in- 
creased by a factor 2 — 3. Parameter values in the 
range 1.0 < 7 < 1.5 produce this increase. Figures 
(l)-(3) show a comparison of the damping charac- 
teristics as a function of A, with the value 7 = 1.5. 
Shown are the cases R = 0 , R = and R = 1, 
with 1 < A < 7 varying on each plot. Plots of 
R < yg are virtually indistinguishable from the case 
R = 0. The three plots reveal that the pseudo-time 
sub-iteration is stable for A ss 7 , and has the best 
damping characteristics in the § < 0 < 7r range for 
values of A near the stability limit. 

The parametric study yields the following algo- 



FourierMode 

Figure 1. Damping characteristics for the case R = 
% = 0, with “CFL” = A. 

rithm for the pseudo-time sub-iteration. The RK T 
scheme is implemented in five stages with the co- 
efficients being defined by a = [1/4, 1/6, 3/8, 1/2,1]. 
Three evaluations of artificial dissipation terms 
(computed at the odd stages) are used to obtain a 
larger stability bound, which allows a higher CFL 
number in the presence of physical diffusion terms. 
The stability limit of the numerical method is fur- 
ther increased with the use of the implicit residual 
smoothing technique that employs grid aspect-ratio- 
dependent coefficients 18 and local time-stepping is 
used in each cell. The efficiency of the solution pro- 
cess is significantly enhanced through the use of a 
multigrid acceleration technique. For a detailed de- 
scription of the resulting iterative method see Vatsa 
et al. 18 The same concept for the computation of 


5 




Figure 2. Damping characteristics for the case R = 
% = with “CFL ” = A. 

unsteady flows is used by other researchers, see for 
example Martinelli 10 and Swanson and Turkel 16 . 

Diagonal Coefficients 

The pseudo-time sub-iteration is strongly influ- 
enced by the diagonal coefficients akk and /?*. In 
TLNS3D the pseudo-time sub-iteration is always ad- 
vanced with the maximum allowable scaled pseudo- 
time-step A = a/jfcA ss 7. The rate of relaxation in 
non — scaled pseudo-time is therefore, inversely pro- 
portional to the diagonal coefficient akk'- the smaller 
the value of a.kk the more rapidly the pseudo-time 
sub-iteration progresses. The values akk and pk vary 
significantly for different physical time-advancement, 
schemes. Table (1) presents the diagonal coefficients 
Pk for the BDF schemes, where the coefficients vary 
by approximately a factor of 2. Table (2) presents 
the diagonal contribution of the ESDIRK schemes, 
as well as the number of stages, the implicit stages 
and the order. The a-kk coefficients vary by ap- 
proximately a factor of two, and are generally much 
smaller than the BDF pk values. 

Unlike the BDF schemes, the ESDIRK schemes 
can be optimized to improve their efficiency. The 
important parameters are the diagonal coefficient 
a.kk and the number of stages s. A fortuitous trend 
observed with the ESDIRK schemes is a general de- 
crease in the value of a.kk with increasing stage num- 
ber, at a given accuracy and L-st ability property. 
Increasing the number of implicit stages does not 
always decrease the efficiency of the scheme, and 
sometimes yields greater efficiency. Fifteen ESDIRK 
schemes ranging from 3 to 8 stages were investigated 
to determine good choices of third-, fourth-, and 



Figure 3. Damping characteristics for the case R = 
% = 1, with “CFL" = A. 

fifth-order accuracy. The ESDIRK3, ESDIRK4, and 
ESDIRK5, used in this study (as well as the work of 
Kennedy 9 ) are representative candidates from this 
study of ESDIRK schemes. The value akk = \ 
in the ESDIRK4 scheme is an example of five im- 
plicit stages producing a more efficient fourth-order 
scheme than do four. This is consistent with the 
findings of other investigators (see Hairer 7 ). 


Method 

Stgs 

Impl Stgs 

Order 

akk 

ESDIRK3 

4 

3 

3 

0.435 

ESDIRK4 

6 

5 

4 

0.250 

ESDIRK5 

7 

6 

5 

0.184 


Table 2. Properties of the ESDIRK methods. 

Validation of the Space-Time Method 


The accuracy of the space-time integration meth- 
ods is investigated for an unsteady laminar flow test 
case. The test problem is laminar flow around a two- 
dimensional circular cylinder at a Reynolds number 
of 1200 and a Mach number of 0.3. The initial flow 
is symmetric with zero lift. As the wake behind the 
cylinder starts to grow, it becomes unstable and be- 
gins to shed from alternate sides of the cylinder. De- 
tailed numerical and experimental investigations of 
this flow have been performed by several authors 
3, 5, 14, 16 The computational grid of 97 x 65 is 
shown in Figure 4. The boundary is a distance of 
20 times the diameter of the cylinder away from the 
wall, while the distance between the wall and the 
first grid point is 0.001 times the diameter of the 
cylinder. Grid points are clustered in the wake. A 
density contour plot is shown in Figure 5 as calcu- 
lated on the 97 X 65 grid. Note that the near wall 
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vortical structures appear to be sufficiently resolved, 
but that resolution is lost as the grid expands in the 
far-field. In these preliminary calculation, a small 
time-step was chosen, so that the dominant compo- 
nent of error is the spatial contribution. 



Figure 4. The 97 x 65 0-Grid used in the circular 
cylinder study. 


Figure 6 shows a comparison of the lift history for 
one shedding cycle as calculated on the 97 x 65 and 
193 x 129 grids. The 193 x 129 data is shifted so 
that the first zero in lift coincides on both curves. 
The Strouhal number in each case was 0.2489, and 
0.2459 as calculated on the 97 x 65 and 193 x 129 
grids, respectively. Small differences in the lift are 
seen near the peaks of the cycle. These values are 
larger than 0.21, reported from experiments 4> 14 . 
As Mittal and Balachander report 13 this might be 
caused by the onset of three- dimensional effects, 
not captured in our two- dimensional computations. 
Two-dimensional computations performed by other 
researchers 5> 17 resulted in larger Strouhal numbers 
too, in the range of 0.23-0.24. This study and a more 
exhaustive study of inviscid vortex propagation sup- 
port the conclusion that the space-time operator is 
working properly, and that the spatial operator con- 
verges at the design spatial accuracy. The coarse 
grid (97 x 65) provides adequate spatial resolution 
to capture the relevant large scales features in the 
shedding process. As such, the 97 x 65 grid is cho- 
sen as the basis of most of the temporal refinement 
studies presented in this work. 



Figure 5. The density contours as calculated on the 
97 x 65 O- Grid. 

Temporal Accuracy 

A temporal refinement study is performed to as- 
sess the accuracy of various ESDIRK and BDF 
schemes. The initial condition for the study was 
obtained by simulating the limit cycle behavior of 
the flow for approximately 20 shedding cycles, with 
a relatively small time-step (At = ^). After 20 
cycles, the solution was stored in a restart file for 
use as the initial condition in the subsequent stud- 
ies. A classical temporal study was then performed 
from this initial condition. A typical practitioner is 
interested in lift, drag, pitching moment, skin fric- 
tion, and the frequency spectrum over several cy- 
cles. Thus, the time interval of the study includes 
approximately l| shedding cycles. This interval is 
sufficient in length to allow accumulation of tempo- 
ral error during the shedding cycle. No exact solu- 
tion is know for this problem, so a “numerical exact” 
solution was obtained using a small time-step and 
a small iteration tolerance on the non-linear solves. 
The “exact” time-step used was At = 0.05, and a 
stopping criterion for the iterative solve of the im- 
plicit equations, was max(residual) < 10~ 6 . The 
“exact” solution is accurate to approximately 6 sig- 
nificant digits in the lift. 

The lift on the body is used as the representative 
measure of error in all calculations. Other integral 
measures including drag, skin friction, total drag, 
pitching moment, as well as L 2 and L 0 0 norms over 
the domain were studied. All integral measures yield 
nearly the same quantitative conclusions, although 
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Figure 6. Comparison of lift over one shedding cycle 
between coarse and fine grids 

the qualitative errors are different for each case. A 
detailed investigation of the location of the maxi- 
mum temporal error revealed that the vortex gen- 
eration process in the near-wall regions is the most 
temporal demanding portion of the cycle. It is not 
surprising, therefore, that the lift integral is a good 
measure of total error in the calculation. 

Detailed results from the study are now presented. 
Figure 7 shows a detailed refinement study with an 
ESDIRK 4 scheme. Shown are the solution errors in 
the lift, viscous drag, total drag, and pitching mo- 
ment, as a function of logarithm of time-step. In 
all cases, the nonlinear system is solved to strict 
tolerances to eliminate “iteration” error as a con- 
taminating variable in the study. At coarse time- 
steps the solution accuracy deteriorates away from 
design accuracy. For sufficiently small time-steps, 
design accuracy is clearly demonstrated in all vari- 
ables. For example, a least-squares fit of the finest 
four data points on each curve reveals that the con- 
vergence rates for [lift, drag V, drag T, pitch] are 
[3.9628, 4.0257, 4.0478, 4.0251]. The coarsest data 
point corresponds to a time-step of At = 2, for which 
twelve points resolve the shedding cycle. Note that 
15 — 20 points are needed to resolve the cycle be- 
fore the fourth-order scheme converges with design 
accuracy. This resolution is consistent with conven- 
tional estimates of “points-per-wavelength” needed 
to resolve a periodic wave. 

Figure 8 presents the error in lift versus the time 
step (log-log) for three BDF schemes and three ES- 
DIRK schemes. The ESDIRK schemes presented in 
this figure are summarized in table (2). Again, the 



Figure 7. Convergence behavior for lift, drag and 
pitching moment as calculated with a fourth-order 
ESDIRK scheme. 

nonlinear equations at each stage (step) are solved to 
tolerances which are small compared with the abso- 
lute error in the calculation. There is a dramatic in- 
crease in accuracy in going from BDF1 to ESDIRK5. 
For example, an error tolerance of 10 -1 is achieve 
with a time-step of At = 1 for the ESDIRK4 and 
ESDIRK5 schemes, while the BDF1 and BDF2 re- 
quire At = 10~ 2 and At = 10 — 1 , respectively. 
The BDF3 and the ESDIRK3 schemes have nearly 
the same absolute level of error, although the con- 
vergence behavior of the BDF3 scheme is sporadic. 
No explanation of this behavior was identified, al- 
though a possible explanation is the schemes lack of 
A-stability. 

As mentioned previously in this work, the BDF2 
scheme is consistently used by practitioners because 
of its robustness, simplicity and efficiency. The re- 
sults in Figure 8 clearly show that the ESD1RK4 
scheme can be used at time-steps which are a fac- 
tor of ten larger than those used in the BDF2, while 
achieving similar accuracy. At fine tolerances, this 
difference becomes even greater. Figure 8 still can 
not be used to conclude that the ESD1RK4 scheme 
is more efficient than the BDF2 scheme. To do so 
requires a detailed accounting of the work involved 
in each algorithm, and is not a simple task. 

Temporal efficiency 

For large computations, the work involved in ad- 
vancing the solution from t = To to t = Tf is 
proportional to the number of non-linear solves re- 
quired over that interval, but also depends strongly 
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Figure 8. Convergence as a function of time-step for 
BDF and ESDIRK schemes. 


Figure 9. Convergence as a function of work for 
BDF and ESDIRK schemes. 


on how quickly each non-linear solve converges. The 
number of non-linear solves is simply Nl = I s , 

where I s is the number of implicit stages per time- 
step. (An explicit stage is virtually "free”: hence the 
use of the ESDIRK instead of the SDIRK schemes.) 
For the BDF schemes I s = 1. A much more dif- 
ficult aspect to predict, however, is how rapidly a 
nonlinear solve will converge. Smaller physical time- 
steps provide a much better initial guess for the non- 
linear iteration, which implies an advantage for the 
BDF schemes. The ESDIRK schemes, however, have 
a much smaller diagonal coefficient a/-k which in- 
creases the asymptotic convergence rate of the multi- 
grid process. 

Figure 9 presents the convergence of the six 
schemes as a function of the required work. Three 
accuracy levels are chosen, [10 -1 , 10 -2 , 10 -3 ] as 
representative of desired engineering accuracy lev- 
els. The appropriate values of At needed for each 
method are obtained from Figure 8. An iteration 
tolerance a factor of 200 times smaller than the de- 
sired error level, is used for the nonlinear iteration, 
with the error based on the Too norm of the resid- 
ual. For example, if the desired error is 10 -2 then at 
each stage (step) the nonlinear system is solved until 
the maximum residual is • The work from each 
method is measured as the total number of multi- 
grid cycles used in the entire time interval. 

An obvious conclusion from the study presented 
in Figure 9 is that the BDF1 scheme (Euler Implicit) 
will never compete with the higher-order schemes in 
terms of efficiency. Similarly, the BDF2 scheme is 
only competitive with the third-, fourth- and fifth- 


order scheme at extremely coarse tolerances. For 
example, the BDF2 is 2.5 times less efficient than 
the ESDIRK4 scheme at an error tolerance of 10% 
(one significant digit in the solution accuracy). As 
the error tolerance becomes more strict, the high- 
order schemes easily outperform the BDF1 and the 
BDF2 scheme. For a desired temporal error of 1% in 
lift the fourth-order integration method ESDIRK4, 
only requires 1.5% of the work required by BDF1, 
an efficiency increase with a factor 70. For a tem- 
poral error of 0.1% solution ESDIRK4 requires 10% 
of the work of BDF2, an efficiency increase with a 
factor 10. Note again that the BDF3 scheme shows 
an irregular behavior. Perhaps this is due to the fact 
that the BDF3 solution oscillates around the exact 
solution for different time-steps, and sometimes by 
coincidence yields unusually low levels of error. 

It is interesting to note that the fourth- and 
fifth-order ESDIRK schemes have exactly the same 
accuracy-work ratio, and that their convergence be- 
havior appears to be logarithmic in nature. Both 
have different time-steps, stages, and diagonal coef- 
ficients a.fcfc, and there is no reason why they should 
lie on the same line, or have logarithmic convergence 
behavior. Though both have the same efficiency, the 
ESDIRK5 scheme requires more storage and is less 
robust than the ESDIRK4 scheme (internal stabil- 
ity problems at huge time-steps). Therefore, for an 
efficient and robust solution in time the ESDIRK4 
scheme is recommended. 

A major contributor to the inefficiency of implicit 
methods is solving the nonlinear systems at each 
stage (step) to inappropriate sub-iteration tolerance 
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Figure 10. The effects of sub-iteration tolerance 
on solution accuracy for the BDF3 and ESDIRKf 
schemes. 


levels. If the nonlinear solve is iterated too many 
times, the additional work does not increase the so- 
lution accuracy or robustness, but does increase the 
cost. If the nonlinear solve is not iterated enough, 
then the solution error will be dominated by the er- 
rors in the nonlinear solve, robustness will suffer, 
and the entire solution will be in jeopardy. The 
solution cannot be more accurate than the errors 
in the nonlinear equations. Unfortunately, it is not 
known a-priori to what levels the nonlinear equa- 
tions should be solved. A final study is performed, 
in which the nonlinear equations are solved to sub- 
iteration levels which were [555 , , 5] of the desired 

error level. The error tolerances used in the previous 
study [1CT 1 , 10~ 2 , 10 -3 ] are again used. 

Figure 10 shows a plot of the solution accuracies 
for the BDF3 and the ESDIRK4 schemes. Ideally, 
increasing the sub-iteration tolerance level, should 
move the curves uniformly to smaller values of work, 
until at a critical tolerance level the solution accu- 
racy begins to deteriorate. The sub-iteration toler- 
ances of A- and yield essentially the same ac- 
curacy levels, but significantly differ in the amount 
of work required. For a sub-iteration tolerance of 
| the solution accuracy begins to degrade for both 
schemes. Similar results are exhibited with the other 
ESDIRK schemes. It is concluded that the nonlin- 
ear sub-iteration in TLNS3D should be converged 
to a level which is at least Aj of the desired solution 
accuracy, independent of the temporal integration 
method used. 


Figure 11. Cycle variation of predicted temporal er- 
ror as calcidated with ESDIRK4 

Temporal error predictor 

An efficient time advancement scheme is capable 
of monitoring the solution error, and adjusting the 
time-step when needed. The ESDIRK schemes excel 
in the variable time-stepping environment because 
they are self starting. At each time-step a reliable 
measure of solution error is needed, however. All 
the ESDIRK schemes used in this study have em- 
bedded schemes available to monitor the solution er- 
ror. Some norm of the solution error is obtained at 
each time-step by comparing the main and embed- 
ded solutions U and U. The main and embedded 
solution differ in accuracy by one order. Thus, the 
difference between the solutions is the leading order 
error term in the temporal Taylor series expansion, 
and is proportional to the time-step error. Knowing 
the solution error at each time-step, the subsequent 
time-step is adjusted to reflect a desired accuracy 
tolerance for the calculation. An additional bene- 
fit. of the error estimate is that a precise stopping 
criterion for the nonlinear sub-iteration can be im- 
plemented. 

Figure 11 shows the predicted temporal error as 
calculated by the ESDIRK4 scheme. The L 2 and 
are presented for the coarse grid case shown in Fig- 
ure 6. A fixed time-step of At = ^ was used with a 
nonlinear sub-iteration tolerance of 0.5 x 10~ 5 . The 
temporal error correlates highly with the maximum 
and minimum lift. Note that about | an order vari- 
ation in temporal error is observed over the shedding 
cycle. Calculations were successfully performed on 
this case in variable time-stepping mode using a con- 
troller (see Kennedy 9 for details). 
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Work continues into automating the time-step 
controller. The objective is automation such that 
the only temporal input is the time-step error. Many 
additional cases must be run in the future, including 
fully turbulent flows, to calibrate the controller. 

Conclusions 

The accuracy and efficiency of several time inte- 
gration schemes has been investigated for the un- 
steady Navier-Stokes equations. Time is discretized 
implicitly, while the spatial discretization is a con- 
ventional cell-centered finite volume scheme with ar- 
tificial dissipation added for stability. The nonlin- 
ear equations are solved at each step with a multi- 
grid algorithm. This study focuses on the efficiency 
of higher-order Runge-Kutta schemes in compari- 
son with the popular Backward Differencing For- 
mulations. For this comparison an unsteady two- 
dimensional laminar flow problem was chosen, i.e. 
flow around a circular cylinder at Re=1200. It is 
concluded that for all realistic error levels (smaller 
than 10 -1 ) fourth- and fifth-order Runge Kutta 
schemes are the most efficient. For reasons of robust- 
ness and computer storage, the fourth-order Runge- 
Kutta method is recommended. The efficiency of the 
fourth-order Runge-Kutta scheme exceeds that of 
second-order Backward Difference Formula (BDF'2) 
by a factor of 2.5 at engineering error tolerance lev- 
els (10 _1 -1(T 2 ). Efficiency gains are more dramatic 
at smaller tolerances. 
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An 

pendix 

The Butcher variables A, 

B, and C for the ES- 

DIRK4 scheme are 

(with 06 j 

= b j) 

0*21 
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0*31 
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0*33 

a 4 i 
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0*43 

«44 
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0*62 
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i>i 

b 2 

b 3 

h 4 
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C2 

03 
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C 5 
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1 1 8611 -1743 
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4 
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4 
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u 
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1 

83 
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